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Abstract 

The transition between the Galactic and extragalactic cosmic ray components could 
take place either in the region of the spectrum known as the second knee or in the 
ankle. There are several models of the transition but it is not possible to confirm or 
even rule out any of them from the flux measurement alone. Therefore, the measure- 
ment of the composition as a function of primary energy will play a fundamental 
role for the understanding of this phenomenon. 

In this work we study the possibility of primary identification in an event by 
event basis in the ankle region, around E = 10 18 eV. We consider as case study 
the enhancements of the Pierre Auger Southern Observatory, which are under con- 
struction in Malagiie, Province of Mendoza, Argentina. We use a non-parametric 
technique to estimate the density functions, from Monte Carlo data, corresponding 
to different combination of mass sensitive parameters and type of primaries. These 
estimates are used to obtain the classification probability of protons and iron nuclei 
for the different combination of parameters considered. We find that, after con- 
sidering all relevant fluctuations, the maximum classification probability obtained 
combining surface and fluorescence detectors parameters is of order of 90%. 
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1 Introduction 



The cosmic ray energy spectrum presents three main features observed by 
several experiments, the knee, observed at around 3 — 5 x 10 15 eV [T||2T3] , the 
second knee, observed at 4 x 10 17 eV [4l5l[6l[7] and the ankle. There is evidence 
of a fourth feature situated at the end of the spectrum, the so-called GZK 
suppression [8f9] . which would be originated by the interaction of the ultra- 
high energy protons with the photons of the cosmic microwave background 
radiation [TUyTTj . For the case of heavier nuclei a similar effect is expected 
because of their interaction with photons from the infrared and microwave 
backgrounds [T2] . 

The origin of the second knee is still unclear, it has been interpreted as the end 
of the efficiency of the acceleration in Galactic supernova remnant shock waves, 
a change in the diffusion regime in our galaxy [T3"][T4"] or even the transition 
between the Galactic and extragalactic components of the cosmic rays |15j . 

The ankle is a broader feature, it has been observed by Fly's Eye [5], Haverah 
Park (TJ5], Yakutsk [5J, HiRes [7] and Auger [8] in Hybrid mode at approxi- 
mately the same energy, ~ 3 x 10 18 eV. AGASA also observed the ankle but 
at a higher energy, ~ 10 19 eV [17J. The origin of the ankle is also unknown, 
it can be interpreted as the transition between the Galactic and extragalactic 
components [18] or the result of pair production by extragalactic protons after 
the interaction with photons of the cosmic microwave background radiation 
during propagation |15j . 

There are three main models of the Galactic-extragalactic transition: (i) the 
mixed composition model [18], in which the extragalactic sources inject a 
spectrum of masses of the form of the corresponding to the low energy Galactic 
cosmic rays and for which the transition take place in the ankle, (ii) the dip 
model [15], in which the ankle is originated by pair production of extragalactic 
protons that interact with the photons of the cosmic microwave background 
radiation, in this scenario the transition is given at the second knee and (iii) 
the ankle model, a two-component transition from Galactic iron nuclei to 
extragalactic protons at the ankle energy [T5] . 

In order to rule out, or even confirm any of those models, additional informa- 
tion is necessary besides the energy spectrum shape and absolute intensity. 
Detailed measurements of the composition as a function of energy would be 
extremely valuable to break the present degeneracy among competing models 
for the Galactic-extragalactic transition [20|2T] . Furthermore, this kind of in- 
formation could help to determine what are the highest energy accelerators in 
the Galaxy and provide indicatives of the kind and level of magnetohydrody- 
namic turbulence present in the intergalactic medium traversed by the lowest 



2 



energy cosmic ray particles [22]. 

Several experiments have measured the cosmic ray composition in the re- 
gion where the transition takes place. Nevertheless, large discrepancies exist 
between different experiments and experimental techniques [23J. One of the 
main reasons behind the plurality of sometimes contradicting results is that 
the composition is determined by comparing experimental data against nu- 
merical shower simulations. These simulations include models for the relevant 
hadronic interactions which are extrapolations, over several orders of magni- 
tude in center of mass energy, of accelerator data to cosmic ray energies. No 
doubt, this is a source of considerable uncertainties which are confirmed, to a 
certain extent, by the fact that there are experimental evidences of a deficit 
in muon content of simulated showers with respect to real data [24]. 

Several multi-parametric techniques for the mass identification of individual 
showers have been studied and discussed in the literature. The most popu- 
lar include non-parametric density estimates [25*26.27] and neural networks 
[25][28] . In particular, in this work we use a non-parametric density estimate 
technique of Gaussian kernel superposition, improved by adaptive choice of 
the smoothing parameters, in order to estimate the identification probability 
of individual nuclei, under the assumption of a binary proton and iron sample. 
The corresponding formalism, developed here, is general and applies to a sam- 
ple of any number components. However, even if the application is formally 
straightforward, a wide range of theoretical and experimental uncertainties 
renders, at present, the extension to more than two components dubious. 

As case study, the mass sensitive parameters that will be available from the en- 
hancements of the Pierre Auger Southern Observatory are considered. Auger, 
in its original design is able to measure cosmic rays of energies above 3 x 10 18 
eV with the surface array and < 10 18 eV in hybrid mode. The enhance- 
ments AMIGA (Auger Muons and Infill for the Ground Array) [2H] and HEAT 
(High Elevation Auger Telescopes) [30] , will extend the energy range down to 
10 17 eV, encompassing the second knee and ankle region where the Galactic- 
extragalactic transition takes place. 

AMIGA will consist of 85 pairs of Cherenkov detectors and muon counters of 
30 m 2 plastic scintillators buried at ~ 2.5 m of depth. These pairs constitute 
the AMIGA infills, which are bounded by two hexagons of 5.9 and 23.5 km 2 
corresponding to arrays of 433 m and 750 m spacing, respectively. The energies 
at which the AMIGA arrays attain full detection efficiency independently of 
the primary mass, are ~ 10 17 eV and ~ 10 17 ' 6 eV for the 433 m and 750 m 
arrays respectively [31] . On the other hand, HEAT consists of three additional 
telescopes with elevation angle ranging from 30° to 58° and located next to the 
fluorescence telescope building at Coihueco. They will be used in combination 
with the existing 3° to 30° elevation angle telescopes at that site, as well as in 
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hybrid mode in conjunction with the AMIGA infills. 

The paper is organized as follow: in section [2] we introduce the classification 
technique used for the subsequent analyses together with an analytical exam- 
ple of application of this technique. In section [3] we describe the full Monte 
Carlo simulations used in the calculation of the classification probabilities. In 
section H] we present the non-parametric methods used in the calculation of the 
classification probabilities and the results obtained for different combination 
of surface and fluorescence mass sensitive parameters. In section [5] we estimate 
the uncertainty in the determination of the composition of a sample obtained 
by using the classification technique introduced earlier. Finally, the discussion 
and conclusions are presented in section [HI 



2 Classification technique 



The event by event classification consists in determining the type of nucleus 
that originated a given event. This is usually done comparing the experimental 
data with simulated data. For this purpose non-parametric techniques like 
Bayes classifiers are very useful tools [251126] . 

From the experimental data or even from simulations, several observable pa- 
rameters, chosen such that they are very sensitive to the chemical composition 
of the primary, can be obtained. Let x be a d- dimensional vector composed 
by the mass sensitive parameters considered, S c i = {Ai,...,A L } the set of 
classes in which the the events will be classified (S c i = {Pr, Fe} in this work) 
and P(x \Ai) the conditional density function for the class Ai. Therefore, the 
probability of Ai given x can be obtained by using the Bayes theorem [32] . 

where p(Aj) gives the prior knowledge about the relative abundances of each 
class. In absence of any prior knowledge the prior probability distribution is 
assumed to be uniform, 

1 L 
p(Ai) = - such that XM^i) = L ( 2 ) 

Li - i 



The classification of the event is given by the class with maximum probability, 

A*(x) = argmax [P(A|f)j , (3) 

Aes cl 

where A*(x) is the class assigned to the event with parameters x. 
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In this work, we consider the classification into proton and iron nuclei, S c i = 
{Pr, Fe}. Therefore, in this particular case, if P(Pr\x) > 1/2 the event with 
parameters x is classified as proton, otherwise iron. 

Therefore, the distribution functions P(x \Pr) and P(x \Fe) are required 
to classify the experimental data into proton and iron nuclei. Estimators of 
these distribution functions are obtained by using a non-parametric method 
of superposition of Gaussian kernels using the data obtained from the detailed 
simulation of the showers, including the response of the detectors and taking 
into account the effects introduced by the reconstruction methods (see section 

H. 

A simplified one-dimensional analytical example is introduced in the follow- 
ing paragraphs in order to better understand the results obtained considering 
parameter spaces of more than one dimension and using non-parametric tech- 
niques to estimate the distribution functions from Monte Carlo data. This 
particular example presents similar features to those found in the complete 
and more sophisticated analysis. 

Let us consider the gamma distribution, 

a (a x\ Q_1 / ax 



where a and a are parameters. It is easy to show that the mean value and 
the variance of this distribution are given by E[x] = o and Varfx] = a 2 /a, 
respectively. 

Two classes, S c i = {a, b}, are considered. The corresponding distribution func- 
tions are P{x \a) = f(x; a, a a ) and P{x \b) = f(x; a, where the parameter 
a is the same for both in order to simplify the calculations. Therefore, as- 
suming a uniform distribution for the prior probabilities, the probability of a 
given x is, 

p( a U) = P ( x l Q ) = f(x;a,<r a ) 

P(x | a) + P(x \b) f(x; a, a a ) + f(x; a, a b ) ' 



If x is a random variable distributed according to P(x \a) or P(x \b), 

_ f(x;a,a a ) , , 

Pa r/ \ , ft \1 l D / 

is also a random variable. Therefore, the distribution functions of p a for the 
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cases in which x is distributed according to f(x; a, a a ) and f(x; a, a b ) are, 



Ps(Pa) 




(7) 



where s = a,b,I p = [1/(1 + (cr a /a b ) a ), 1) if a a > a b and I p = (0, 1/(1 + (a a /a b ) a )] 
if a a < o b and p a can be obtained inverting Eq. (jBJ), 



Ob - Oa 



In 



0V, 



The distribution functions of p a depend on how separated, relative to their 
width, are f(x; a, a a ) and f(x; a, a b ). A measure of this separation is given by 
the parameter, 



\E[x a ] -E[x b }\ 



i = yen — r 

^Var[x a ] + Var[x b ] ^, 



CTb 



crl + a 2 b 



(9) 



which is larger for distributions that are more separated (relative to their 
width). 

Figured] shows P a (Pa) and P b (p a ) and the corresponding distributions fix; a, a a ) 
and f(x; a, a b ) for three different values of 77. The values of i] are obtained by 
fixing a a = 50 and a b = 33 and changing a. 

From figure [T]it is seen that as 77 increases, the distribution functions P a (Pa) 
and P b (pa) concentrate more around p a — 1 and p a = 0, respectively, i.e., the 
classification between classes a and b get better. P a (Pa) and P b (Pa) are not 
symmetric, this is due to the different values of the standard deviations of 
f(x;a,a a ) and f(x;a,a b ). Moreover, for the different values of 77 considered 
P a {Pa) is more concentrated around p a — 1 than P b {p a ) around p a = 0, this is 
because a a > cr b . 

The classification probabilities for classes a and b are given by, 



P: i = P a {p a >l/2)= f 1 d Pa P a ( Pa ) 

Jl/2 

P b d = P b {p a < 1/2) = f 1/2 d Pa P b (p a 

JO 



(10) 



11^ 



and the classification probability for both classes is P c \ = (P£ + P^)/2. Figure 
[2] shows the classification probability as a function of rj for classes a, b and 
both together. As expected, they increase with r] getting closer to one. 
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0.06 




Fig. 1. Left panels show P a (p a ) and Pb(p a ) as a function of p a for three different val- 
ues of rj and right panels show the corresponding distributions functions f(x;a,a a ) 
and f(x;a,ab)- Dashed lines correspond to class a and solid lines to class b. The 
parameters a a = 50 and = 33 are fixed and a is varied to obtain the different 
values of rj considered. 



The classification probability, for the particular distributions considered in this 
example, is larger for the distribution with smaller standard deviation, this is 
not general, it depends on the shape of the distribution functions considered. 
To better understand this fact the classification probabilities can be rewritten 
in the following way, 
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Fig. 2. Classification probability of classes a, b and a and b together as a function 
of rj. 



pa . 



dx Q(f(x; a, a a ) - f(x; a, a b ))f(x; a, a a ] 

oo 

dx f(x;a,a a ), 



dx Q{f{x; a, a b ) - f(x; a, a a ))f(x; a, a b ) 

XQ 

dx f(x;a,a b ), 



(13) 



where Q(x) = 1 for x > and 0(a;) = otherwise and xq is the solution of 
f(xo]a,a a ) = f(xo;a,a b ). The classification probability for a given class is 
the integral of its distribution function over the interval for which it is grater 
than the distribution function of the other class. Therefore, from Eqs. ffT2~l) 
and ([TBI it can be seen that although it is not general that, if the standard 
deviation of the distribution of class a is grater than the corresponding to class 
b then P£ < P^, it happens for many different kind of distribution functions, in 
particular, it is true for the ones considered in this work and also for Gaussian 
distributions. 
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3 Simulations 



3.1 Optimum energy bin 



The primary energy estimated from the experimental data given by an array 
of Cherenkov detectors is obtained by fitting a lateral distribution function to 
the total signal in each station. This allows to interpolate the shower signal at a 
fixed distance from the core which, in turn, is used as an energy estimator. This 
reference distance is such that the shower fluctuations go through a minimum 
in its vicinity, and its exact value depends on the geometry of the array; for 
Auger, the reference distance is 1000 m for the 1500 m baseline spacing and 
600 m for the AMIGA infill of 750 m of spacing. 

The signal at the reference distance is calibrated with the fluorescence tele- 
scopes via hybrid events. The corresponding energy uncertainty for the 1500 
m-array of Auger is ~ 18% [33]. Guided by this experimental result, we assume 
in this work a 25% Gaussian energy uncertainty. 

The study of the classification probability at energies of order of E = 10 18 
eV requires the determination of the energy interval of reconstructed energies, 
centered at E r0 , defined as n r = [(1 - 5)E r0 , (1 + 5)E rQ ] (5 = 0.25 for 25% 
of energy uncertainty) such that the fraction of events of the interval Ho = 
[(1— 5)Eq, (1+5)Eo] is maximum. The intervals Il r and Ho are different because 
of the spectrum, the contamination of events of real energies smaller than 10 18 
eV which are outside n is grater than the one corresponding to energies, also 
outside n , but above 10 18 eV. We follow the procedure introduced in Ref. 
to determine IL which is detailed bellow. 



The number of showers with real energies between E and E + dE is given by, 

— (E) = No ( 7 - 1) j h2 , E~\ (14) 
dE \ ) o w ! El' 1 -El' 1 V ' 

where \Ei,Eq\ = [0.1, 10] x 10 18 eV is the region of the spectrum considered, 
No is the number of events in the interval [Ei, E%] and 7 = 2.7. 

The number of events whose real energies belong to Ilo, such that the recon- 
structed energies fall in Il r is, 

r(l+S)E r0 r(l+S)E fiN 

f A {E r o) = dE dE' -j-(E') G(E, E ), (15) 

J(l-5)E r0 J(1-S)E dE 1 



where G(E, E') = exp[-(E - E'f /(25 2 E' 2 )]/(V^ 5 E'). 

On the other hand, the number of events with real energies outside Ilo whose 
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17.8 17.9 18 18.1 18.2 18.3 

log(E/eV) 



Fig. 3. Ratio between the number of events whose real energies do not belong to Ho 
but the reconstructed one fall in n r and the number of events whose real energies 
belong to Ho but the reconstructed one fall in IL., F(E t q). The minimum is located 
at E r0 = 1-14 x 10 18 eV. 

reconstructed energies fall in n r is, 



fB(Ero) 



(l+S)E r0 

(l-S)E r0 

E% 



dE 



(1-S)E dN 

dE'-(E') G(E,E>)+ 
Ex dE' 



d, N 

/ dE' ^rr(E') G(E,E') 
J(i+s)e dE' K J v ' 1 



(16) 



Therefore, the value of E r0 for which the fraction of events belonging to n 
that fall in 11,. is maximum is obtained by minimizing the function F(E r0 ) = 
fB(E r0 )/fA{E r Q). Figure [3] shows F(E r0 ) for 5 = 0.25 from which can seen 
that it has a minimum but at an energy grater than 10 18 eV. The minimum 
is located at E r0 = 1.14 x 10 18 eV, then, IT r = [0.86, 1.43] x 10 18 eV. 

The studies under consideration require the simulation of the cosmic ray en- 
ergy spectrum in a large energy interval around 10 18 eV. The number of show- 
ers required to obtain different samples of good statistics is very large. This is 
a very difficult task because of the computer processing time and disk space 
needed. The number of showers is considerable reduced considering just the 
events whose reconstructed energies fall in Il r . Therefore, the distribution 
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E [EeV] 



Fig. 4. Distribution function of real energies for events whose reconstructed energy 
falls in II r for <5 = 0.25, 7 = 2.7, E 1 = 0.1 EeV and E 2 = 10 EeV. 

function of real energies corresponding to events whose reconstructed energies 
fall in IT r can be used to obtain the input energy for the simulations. This 
distribution is given by, 



dN r(i+S)Ero 
P R (E)=C—(E) dE G(E, E), (17) 

dE J(l-S)E r0 



where, 



rE 2 dN r(l+8)E r0 

C- x =\ dE^-(E) / dE' G(E, E'). (11 

J Ei dE J(l~S)E r0 



rE 2 dN r (l+S)E r0 

dE —(E) \ 

I Ex dE J(l-5)E r0 

Performing the integral in Eq. (II 711 the following analytical expression is ob- 
tained, 



P R (E) = C E^ 
where, 



\ V2SE J I V26E 



(19) 



Erf(z) = — [ Z dt exp(-t 2 ). (20) 
\/n Jo 



Figure 1 shows P R (E) for 5 = 0.25, 7 = 2.7, E 1 = 0.1 x 10 18 eV and E 2 = 
1 x 10 19 eV. The distribution of real energies is not symmetric with respect 
to 10 18 eV, it has a tail at large energies due fundamentally to the assumed 
constant relative error on primary energy. 
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3.2 Showers and detector simulations 



The simulation of the air showers is performed using the package Aires version 
2.8.4a [35]. Firstly, 8 sets of 20 independent samples are simulated correspond- 
ing to proton and iron nuclei as primaries, zenith angles 9 = 30° and 9 = 40° 
and two different hadronic interaction models, QGSJET-II [3"oTl3"T] and Sibyll 
2.1 [SB]- Each sample consists in 50 showers with primary energies obtained 
by taking at random 50 independent values from the distribution of Eq. (TT9"]) . 
In this way the effects of the spectrum and energy uncertainty are taken into 
account (see subsection 13.11) . Secondly, 8 samples of 50 showers each are also 
generated corresponding to proton and iron primaries, 9 = 20°, 25°, 35°, 45° 
and QGSJET-II as the hadronic interaction model. Also in this case the input 
energy is obtained by sampling the distribution of Eq. ffl9l . 

The simulation of the response of the Cherenkov detectors and the muon 
counters of the 750 m-AMIGA array is performed by using a dedicated package 
described in Ref. [39]. Muon detectors of 30 m 2 of area segmented in 192 
parts are used for the simulations. It is assumed a time resolution of 20 ns 
and the efficiency of each bar equal to one. Each shower is used 50 times by 
uniformly distributing impact points in the array area. The reconstruction of 
the arrival direction and core position of the events is performed by using the 
package of Ref. [40], specially developed to reconstruct the Cherenkov detector 
information in Auger. The muon lateral distribution function is reconstructed 
using the method introduced in Ref. [39] and the parameter X max is obtained 
following the method described in Ref. [39J , which takes into account the effect 
of the HEAT telescopes and the Reconstruction procedure of the longitudinal 
profile. 

In short, 20 independent samples of N = 50 x 50 = 2500 events (depending 
on the reconstruction efficiency) are obtained for each 9 £ {30°, 40°}, type of 
primary and hadronic interaction model. Besides, two samples of N = 2500 
events are also obtained for each 9 £ {20°, 25°, 35°, 45°} type of primary 
and QGSJET-II as the hadronic interaction model. 



4 Numerical approach 

4-1 Methodology and results 

The classification method considered in this work require the knowledge of the 
conditional density functions P(x \A). Therefore, the non-parametric method 
of kernel superposition [4T|42|43lT44] is used to estimate the probability den- 
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sity functions, from the simulated data, for the different set of mass sensitive 
parameters considered. Besides, the adaptive bandwidth method introduced 
by B. Silverman [4T] is also implemented in order to obtain better estimates 
of the density functions. 

The procedure starts by performing a first estimation of each density function 
using a Gaussian kernel with fixed smoothing parameter, 



P {x \A) 



N 



1 N 

ex p 

V| (27r) d / 2 h d Q £1 



[X - Xi 



2hl 



(21) 



where x is a d-dimensional vector corresponding to one of the sets of param- 
eters sensitive to the primary mass considered, N is the size of the sample, 
V is the covariance matrix of the data sample and ho = 1.06 x jV~ 1 '( (i+4 ) is 
the smoothing parameter corresponding to Gaussian samples which is used 
very often in the literature because it gives very good estimates even for non 
Gaussian samples. 

The following parameters are calculated by using the estimate obtained from 
Eq. fl2U), 

-1/2 

Po(xi \A) 



A; 



and then, the final density estimate is obtained from, 



l/N 



(22) 



P(x \A) 



NJ\V\ (27r) d / 2 



JV 

E 

i=l 



exp 



(x — Xi) T V 1 (x — Xi) 

2hl 



(23) 



where h~ = hn A,-. 



As mentioned, different type of mass sensitive parameters are used for the 
subsequent analyses: the depth of the maximum development of the shower, 
X max , the number of muons at 600 m from the shower axis, A^(600) and a 
set of parameters coming from the Cherenkov detectors, SD = {ii/2> P, R}, 
where ty2 is a parameter constructed from the individual rise-time of a subset 
of stations belonging to each event (see Ref. [39]), P is the slope of the lateral 
distribution function of the signal in the Cherenkov detectors and R is the cur- 
vature radius of the shower front. The following combination of parameters are 
considered: (i) (N„(d00),X max ), (ii) SD = (t 1/2 ,P,R), (Hi) (jV„(600),SX>), 



H (X max ,SD) and (v) All = (^(600), X maxi SD). 



Ten pairs of density estimates, {Pi(x \Pr),Pi(x \Fe)} with i = 1 ... 10 are 
obtained for each zenith angle (9 = 30° and 40°), hadronic interaction model 
and set of parameters. 10 of the 20 samples (see subsection [3]) corresponding 
to protons and 10 of the 20 samples corresponding to iron nuclei are used to 
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construct those estimates. The rest 10 proton samples and 10 iron samples for 
each zenith angle and hadronic interaction model are left as test samples. 



Therefore, 10 different estimates of the probability of proton given x are ob- 
tained (see Eq. ([T])), 

p.(x \Pr) 

PAPr \x) = n 1 J with i = 1 ... 10, (24) 

Pi{x \Pr)+Pi(x \Fe) 

assuming no prior knowledge. 

Each Pi(Pr \x) in combination with the 20 proton and iron test samples are 
used to obtain samples of the distributions functions, P pr {p pr ) and Pf e (p pr ), 
where p pr is the random variable obtained evaluating Pi(Pr \x) with a ran- 
dom vector x distributed following the proton or iron distributions (see Eq. 
(jBJ)). In this way, 100 samples of the distributions P pr {p pr ) and Pf e (p P r) are ob- 
tained for each zenith angle, hadronic interaction model and set of parameters 
considered. 

Figure [5] shows the average, with the corresponding errors, of the 100 his- 
tograms of the samples of the distributions P pr {p pr ) and Pf e (p pr ) correspond- 
ing to 9 = 30°, QGSJET-II and for the different combinations of parameters 
considered. It shows that the best event by event classification is obtained 
by using all parameters because the distribution for protons is the most con- 
centrated around one and the corresponding to iron nuclei around zero. The 
following best combination of parameters is (iV M (600), X max ) but not too far 
from the previous case. Although the combination (X max , SD) is better than 
(A^(600), SD), the improvement in the classification probability given by the 
addition of iV M (600) to the other surface detector parameters is very impor- 
tant because the duty cycle of the fluorescence detectors is ~ 10%, which 
means that the size of the data sample with available X max is ~ 10% of the 
corresponding to the surface detectors. Moreover, it is assumed, as an upper 
bound, 25% of energy uncertainty, presumably the energy uncertainty will be 
smaller, like in the higher energy range which is < 20%. In this case the clas- 
sification probability of the combination (A^(600), SD) will be equal or even 
better (depending on primary energy) than (X max , SD) (see Ref. [39|). 

Figure [5] also shows that the distribution of p pr for protons are more concen- 
trated around one than the corresponding to iron nuclei around zero. This is 
due to, for most of the parameters considerecO], the fluctuations corresponding 
to protons are larger, like in the simplified example of section [2j 



2 This is not the case for A^(600), for which, although the shower to shower fluctu- 
ations in the muon content are smaller for iron nuclei, when the energy uncertainty 
is included, they become of the same order or even larger than the ones for protons. 
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Fig. 5. Average distributions of p pr with the corresponding errors for protons and 
iron nuclei and for the different sets of parameters considered. The zenith angle is 
6 = 30° and the hadronic interaction model QGSJET-II. 



The classification probability for protons, iron nuclei and both together are 
estimated from (see Eqs. (llOlllip ). 
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E©(P^,i-V2) = ^, (25) 



./<• 



P i e =TT E©( 1 /2-^) = Tf , (26) 

Pd ~ N pr + N fe ' (27) 

where j?p r 4 is the probability of ith event, corresponding to a sample of primary 
type A = pr, fe and size Na, to belong to the proton class and ua is the 
number of events correctly classified. 

Figure [6] shows the distributions of P c i for QGS JET-II, 9 = 30° and each set 
of parameters considered. Consistently with the results obtained for distribu- 
tion functions of p pr , the highest classification probability is obtained using 
all parameters considered followed in decreasing order by (JV^(600), X max ), 
(X max , SD), (JV„(600), SD) and finally SD. 

Table [1] shows the medians and the regions of 68% of probability for the 
distributions of figure M and also for the corresponding to proton and iron 
classification probability. Although the distributions of p pr for protons are 
more concentrated around one than the corresponding to iron around zero, 
the classification probabilities for iron nuclei are in general grater than for 
protons. This happens because, the fluctuations of the proton parameters are 
in general larger than for iron nuclei (see example of section [2]). 

Table 1 

Medians and regions of 68% of probability for the classification probability distribu- 
tion of protons, iron nuclei and both together, for 9 = 30° and hadronic interaction 
model QGSJ ET-II. 

Parameters P^ e P c i 



All 


n Qi + - 02 

u - J1 -0.03 


Q2+ - 03 


92+ - 01 
u - JZ -0.02 


N ja -\- X max 




90+ - 04 
u - yu -o.oi 


0.91 ±0.02 


X m ax ~\~ SD 




0.89 ± 0.02 


0.88 ±0.02 


N^ + SD 


0.8<8 3 7 


0.86±8;8§ 


n oc:+0.03 


SD 


0.77 ±0.03 


o.8ot8:8 3 


0.78 ±0.01 



Figure [7] shows the medians and the region of 68% of probability for the 
classification probability distributions corresponding to the different combina- 
tions of parameters, 9 = 30° and 9 = 40° and for the hardronic interaction 
models QGSJET-II and Sibyll 2.1. Although the differences between the dis- 
tributions corresponding to QGSJET-II and Sibyll 2.1 are small, the medians 
corresponding to 9 = 40° and QGSJET-II are systematically larger than the 
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Fig. 6. Distributions of P c i for 9 = 30° and QGSJET-II as the hadronic interaction 
model for the different sets of parameters considered. 

corresponding to Sibyll 2.1. Besides, the classification probabilities are in gen- 
eral larger for 9 = 30°, this is due to the fluctuation of the parameters are 
larger for 9 = 40°. 

The effect of assuming a given hadronic interaction model as the true one 
whereas the real one is the other is also studied. For that purpose, 20 pairs 
{Pi(x \Pr), Pi(x \Fe)} with i — 1 ... 20 are constructed with the proton and 
iron samples corresponding to QGSJET-II for 9 = 30° and 9 = 40°. Then, 
the corresponding 20 proton samples and 20 iron samples, but generated with 
Sibyll 2.1, are used as test samples. In this way, the relevant distributions are 
obtained but assuming that QGSJET-II is the true hadronic interaction model 
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Fig. 7. Classification probability (medians and regions of 68% of probability) for 
the different sets of parameters considered, for 9 = 30° and 9 = 40° and for the 
hardronic interaction models QGSJET-II and Sibyll 2.1. 

whereas the real one is Sibyll 2.1 (case Q — S). The same strategy is repeated 
but in this case assuming that Sibyll 2.1 is the true hadronic interaction model 
whereas the real one is QGSJET-II (case S — Q), i.e., the density estimates 
are obtained using Sibyll 2.1 samples and QGSJET-II samples are used as test 
samples. 

Figure El shows the medians and the regions of 68% of probability of the classifi- 
cation probability distributions for the different sets of parameters considered, 
9 = 30° and 9 = 40° and for both cases, Q — S and S — Q. The classification 
probabilities, for the different combinations of parameters considered and for 
both zenith angles, result compatibles between Q — S and S — Q cases. This 
happens because the hadronic interaction models considered produce observ- 
able parameters that are quite similar between themselves. Besides, the values 
of the classification probabilities for the cases, Q — S and S — Q, are compati- 
bles with the ones obtained considering each one of those hadronic interaction 
models used to obtain both, the density estimates and test samples. 

Note that, the previous analyses are done by using samples of events which 
are not independent because each shower is used 50 times to simulate the 
response of the detectors (see, however, Ref. |45j). This fact does not affect 
the present calculation, see appendix [S] for details. 

We center here in the Auger enhancements. Consequently, we particularize 
our analysis for the specific properties of these detectors. Regarding detector 
resolution, we assume an energy error of 25%, which is much larger than 
the one obtained at present by the current Auger calibration with hybrid 
events (~ 18%). In the same way, the error in X max , AX max , is consistent 
with those obtained by the Auger fluorescence detectors (~ 20 g cm~ 2 at 
~ 1 EeV). The resolution on the determination of iV M (600) (< 12% for E > 
1 EeV) has been calculated in Ref. specifically for the muon counters 
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Fig. 8. Classification probability (medians and regions of 68% of probability) for 
the different sets of parameters considered, 8 = 30° and 9 = 40° assuming that 
QGSJET-II is the true hadronic interaction model whereas the real one is Sibyll 2.1 
(Q — S) and vice versa (S — Q). 

under consideration. Experimental uncertainties related with the simulation 
of the Cherenkov detectors or the reconstructed position of the core have 
been discussed extensively in Ref. jl6] and [31], respectively, and their results 
have been incorporated in the calculation of the discrimination probabilities 
presented here. 

Nevertheless, the muon counters are the only component of the system that 
has not been built yet beyond the prototype stage and, therefore, might be 
thought of as more uncertain. It can be shown that, by adding Gaussian 
fluctuations of magnitude a, from an unspecified origin and applied to each 
muon counter, the relative error in the determination of iV M (600) increases less 
than 5% for a < 25%, which would increase the relative error from 12% to 
17%. At this level, the error in energy, already included in our model, would 
still be dominant, leaving our conclusions unchanged. 



4-2 Zenith angle dependence 



The distribution functions of the different combinations of mass sensitive pa- 
rameters depend on zenith angle. Therefore, as shown in figure [7J the classifi- 
cation probability also depends on it. 

The parameter X max weakly depends on 9 but A^(600) do not. The simulations 
show that a new parameter, based on A^(600), but weakly dependent on zenith 
angle is obtained from, 



A> M (600) = iV M (600) 



cos 6 re f 
cos 9 



3/2 



(281 
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Fig. 9. Classification probability as a function of zenith angle corresponding to 
{^(600), X max } and hadronic interaction model QGSJET-II. The proton and iron 
nuclei density estimates of this pair of parameters used for the analysis correspond 
to QGSJET-II and 6 = 30° (see text for details). 



where 9 re f = 30° is a reference angle. In this way a pair of mass sensitive 
parameters, {iV M (600), X max }, that are almost constant with 9 is obtained. 
Therefore, the estimates corresponding to 9 = 30° can be used to obtain 
the classification probability as a function of zenith angle for this pair of 
parameters. 

The 20 pairs of proton and iron samples, corresponding to 9 = 30° and 
QGSJET-II, are used to construct the corresponding density estimates for 
x = (iV M (600), X max ). Then the single pairs of proton and iron samples cor- 
responding to 9 = 20°, 25°, 35°, 40°, 45° (just one pair of proton and iron 
samples of the 20 available for 9 = 40° is considered) is used to calculate the 
classification probability as a function of 9, see appendix [B] for details of the 
calculation. 

Figure shows the obtained classification probability as a function 9. The 
result corresponding to 9 = 30°, obtained in the previous subsection (see 
table [1]), is included in the figure. From this figure it can be seen that the 
classification probability is almost constant with 9 and the mean value is 
approximately 0.9. 
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5 Composition of a sample 



The determination of the composition of a sample of size N = N pr +Nf e , where 
N pr and Nf e are the number of protons and iron nuclei, respectively, consists 
in the estimation of the proton content, i.e., c p = N pr /N. For that purpose 
suppose that the classification probabilities obtained for a given set of pa- 
rameters are and P^ e for protons and iron nuclei, respectively. Therefore, 
the number of proton and iron events correctly classified follow the binomial 
distribution, 



P(n 



pr j 



P(n fe 



n 



pr t 



'pPr^N V r-n pr /-^ _ pp 



fe\ 



(29) 
(30) 



cl 
pr 



The number of events classified as protons or as iron nuclei are n\ 
Nf e —rif e and rij e = n/ e + N pr — n pr , respectively. By using Eqs. fl29| [301) to 
calculate the expected values of these quantities the following expressions are 
obtained, 



E\n\ 



I 



prl 



E[n%i 



N, 



pr 



ppr 



1 " Pll 



1 r cl r cl 



N, 



pr 



(31) 



Last equation suggests the following definition of estimators of Npr and Nf £ 



N pr 



»-l 



which, function of n pr and n/ e give, 



" 'pr 



(32) 



N fe P^ + N pr (P^-l) + n P r-n fl 



fe 



N 



pr 



ppr , p/e 



N 



fe 



NprP![ + N fe {P- 



pr 
cl 



n 



pr 



ppr , p 



fe 
cl 



(33) 
(34) 



Note that, by construction, E[N pr ] = N pr and E[Nf e ] = Nf e (the estimates 
are non-biased). Therefore, the composition estimator c p = N pr /N is also 
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non-biased and the variance, obtained from Eq. (1331) using (129]) and (130}) . is 



N y45^$2«. ,35, 



N { p d + pit - 1) 



Note that the variance is inversely proportional to the sample size. 

As shown in subsection 14.11 P^ and P c { e are random variable. This happens 
because finite samples of events are used as test samples as well as to construct 
the density estimates. Therefore, the distributions of and P^ e , obtained 
in subsection 14. 1\ correspond to samples of ~ 2500 events to construct the 
density estimates as well as for the test samples size. 

The lowest values corresponding to the region of 68% of probability of the dis- 
tributions of P£ and P c { e (see tabled] for 9 = 30° and QGSJET-II) are used to 
obtain an upper limit of the uncertainty in the determination of the compo- 
sition of a sample, cr[c p ] = Varldp} 1 ^ 2 . This approximation overestimates the 
error because, as mentioned, the classification probability distributions also 
include the fluctuations due to the finite size of the test samples. In fact, the 
standard deviation for these fluctuations is <jf s = [p(l — p)/N] 1 / 2 , where p 
is the classification probability of protons or iron nuclei corresponding to a 
given pair of density estimates. oy s < 0.01 for samples of ~ 2500 events and 
p = 0.7 — 0.9. From table [fl it can be seen that such fluctuations are com- 
parable to the total fluctuations. Test samples of much larger size are needed 
to obtain the classification probability distributions corresponding to density 
estimates built from samples of ~ 2500 events, without including such fluctu- 
ations. Note that also changing the number of events of the samples used to 
construct the density estimates would modify the fluctuations on the classifi- 
cation probability distributions, in fact, such fluctuations can be reduced by 
using samples of larger number of events. 

Figure [TD] shows a [c p ] as a function of c p for the different sets of parameters 
considered, 9 = 30°, QGSJET-II and for samples of 100 events, which is the 
number of hybrid events expected for the energy interval n r in two years of 
data taking of AMIGA and HEAT [31]. As mentioned, u[c p ] is calculated by 
using the lowest values of the region of 68% of probability corresponding to 
the proton and iron classification probability distributions. From this figure 
it can be seen that cr[c p ] varies very slowly with c p because the classification 
probability for protons and iron nuclei are quite similar (see table [T|). As 
expected, the larger the proton and iron classification probabilities the smaller 
the values of cr[c p }. 
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Fig. 10. cr[c p ] as a function of c p obtained by using the lowest values of region 
of 68% of probability corresponding to the proton and iron classification proba- 
bility distributions, for samples of 100 events, 6 = 30° and hadronic interaction 
model QGSJET-II. The curves correspond to: from bottom to top, All parameters, 
{N^600),X max }, {X rnax ,SD}, {N^(600),SD} and SD. 

6 Conclusions 



Under the assumption of a binary mixture of proton and iron nuclei, we study 
the possibility of assigning a statistically meaningful classification probability 
to individual cosmic ray showers in the ankle region of the spectrum. We use a 
non-parametric technique to estimate the relevant multi-dimensional density 
functions for the different combinations of parameters that will be available 
from the Auger enhancements AMIGA and HEAT. 

We find that, as expected, the maximum classification probability is obtained 
by combining simultaneously all surface and fluorescence parameters (~ 92% 
for QGSJET-II). However, the two leading parameters are {^(600), X max } 
which, used in a two-dimensional analysis, already give a classification prob- 
ability almost as large as the whole set of parameters combined (~ 91% for 
QGSJET-II). This means that, the separation between protons and iron nu- 
clei, is mainly expressed experimentally through these two parameters. 

We also study the effect of assuming QGSJET-II as the true hadronic in- 
teraction model (used to construct the density estimates) when the real one 
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is Sibyll 2.1 (used to obtain the test samples) and vice versa. We find that 
the classification probabilities obtained in either case, have systematic differ- 
ences which are much smaller than their statistical fluctuations. This happens 
because the hadronic interaction models considered produce similar air show- 
ers and, consequently, similar mass sensitive parameters. Nevertheless, other 
models, like EPOS [47J, which predict a very different shower muon content, 
would affect the composition estimates obtained with our technique. If the 
later were the possible way out might come from the application of 

a bi-dimensional technique, in which X max and information from hybrid 
events is used simultaneously in order to disentangle the hadronic uncertainty 
from the composition effects (see Ref. |34j). 

Although the values of the classification probability, obtained for the different 
sets of parameters considered, do not allow a reliable classification into proton 
and iron primaries on an event-by-event bases, the composition of a sample 
can be estimated with reasonable accuracy using this technique. It must be 
kept in mind, however, that any composition estimate inside this framework, is 
bounded by the simplifying assumptions of a binary mixture and of a hadronic 
interaction model that is well represented by either QGSJET-II or Sibyll 2.1. 
The limitation to a mixture of two components is not actually a limitation 
of our mathematical formulation, as expressed for example in Section [2], but 
a realization of the limitations posed by intrinsic shower fluctuations, detec- 
tor uncertainties and limitations and the lack of a precise knowledge of the 
hadronic interactions. Regarding the latter, the most widely used models in 
the literature are at present QGSJET and Sibyll in their latest versions, which 
we use in the present work. Even if the differences between these two models 
are not large enough as to render our results ambiguous, these models could 
be wrong. In fact, as already mentioned, the new model EPOS predicts a 
much higher number of muons for a given nuclei at any energy. Needless to 
say, these constraints apply not only to our own, but to any technique that 
attempts to determine composition, and they are important factors to which 
accelerator particle physics will certainly make a fundamental contribution in 
the near future. 
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A Effect of the non independence of the events 

The computer processing time and disk space required to obtain several sam- 
ples of showers with good statistics are very large. Therefore, each simulated 
shower is used 50 times, to increase the statistics, by uniformly distributing 
impact points in the array area (see section [3]). The simulated events generated 
in this way, corresponding to a given sample, are not independent. 

A pair of proton and iron samples of 1000 independent events each (showers 
after detectors simulations) are obtained by taking at random 50 independent 
events corresponding to 50 independent showers of each of the 20 proton and 
20 iron samples available for each zenith angle (30° and 40°) and hadronic 
interaction model considered. Then, the leave-one-out technique [26] is used 
to estimate the classification probabilities for each pair of proton and iron 
samples and combination of parameters considered. 

Table IA.1I shows the classification probabilities obtained for 9 = 30° and 
QGSJET-II for the different combination of parameters considered. Although 
the number of events corresponding to the proton and iron samples composed 
by independent events is ~ 2.4 times smaller than the one used for the analysis 
of subsection 14.14 the obtained values of the classification probabilities P^{ , 
P c { e and P d of table [O are contained in the region of 68% of probability of 
the corresponding distributions for the non-independent event samples (see 
tabled). 

Table A.l 

Classification probability corresponding to protons, iron nuclei and both together, 
obtained from two samples, one for protons an the other for iron, of 1000 indepen- 
dent events each, for 9 = 30° and high energy hadronic interaction model QGSJET- 
II. 



Parameters 


ppr 
r cl 


p/e 
r cl 


Pel 


All 


0.88 


0.92 


0.90 


+ X max 


0.88 


0.91 


0.89 


X m ax + SD 


0.84 


0.88 


0.86 


N^ + SD 


0.83 


0.84 


0.83 


SD 


0.74 


0.79 


0.77 



Similar results were obtained for 9 = 40° and for Sibyll 2.1 as well as for the 
cases in which one hadronic interaction model is used to construct the density 
estimates and the other for the test samples. For all these cases the results 
obtained for the samples composed by independent events are compatible with 
the corresponding to the samples composed by non-independent events. 



25 



B Calculation of the classification probability using a single test 
sample 



In this appendix details of the calculation of the classification probability and 
their uncertainties corresponding to the pair of parameters {iV M (600), X max } 
(see subsection |4.2[) are given. 

The pair of proton and iron samples of number of events N pr and Nf e , respec- 
tively, for a given zenith angle is used together with 20 pairs of proton and iron 
density estimates corresponding to 9 = 30°. In this way, 20 different values 
for protons and 20 for iron nuclei of the number of events correctly classified 
are obtained (see Eqs. (j2"5] I2"6"l |2"7]) ): n® r (s) and n° e (s) with s — 1, . . . , M and 
M = 20. 

The distribution function of the number of events correctly classified is bino- 
mial, therefore, including the uncertainty in the determination of the param- 
eter p of the binomial distribution, inferred from a given number of positive 
trials n A (s), the following expression is obtained, 

P(n A \N A ,n A (s)) = ( Na ) f\pp nA (l-p) NA - nA B(p\n A (s) + l,N A +n A (s)-l), 
\n A J Jo 

(B.l) 

where, 

B ^-wMk^-^ < B - 2 > 

is the Beta distribution and T(x) is the Gamma function. 
The distribution function of n A can be estimated from, 

i M 

fM = T7 E P ( n A\N A ,n° A (s)). (B.3) 



Therefore, the mean value and standard deviation of the classification proba- 
bility, P c f = n A /N A , are given by, 
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MN A 
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W A 



3=1 



M 



M 



J2(a[n A (s)] 2 + E[n A (s)] 2 ) - E[P t 



A]2 
cl\ 



1/2 



(B.4) 
(B.5) 



by using Eq. (IB.ip they become, 
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E[n A (s)} : 
a[n A (s)f ■■ 



N A 

N A + 1 1 

E[n A (s)](l - E[n A {s)}) + 

(N A -l)(N A -n A (s) + l)(n A (s) + l) 

(iV A + 2)2(iV A + 3) 



(B.6) 



(B.7) 



Finally, the mean value and the standard deviation of P c i are estimated from, 
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E[P ( 



cl\ 
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cl\ 
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,1/2 



M 



J2(a[n(s)} 2 + E[n(s)} 2 ) - E[P d ] 



8=1 



(B.8) 
(B.9) 



where N = N pr + N fe , E[n(s)\ = E[n pr (s)} + S[n/ e (s)] and cr[n(s)] 2 
a[n pr (s)f + (T[n pr (s)] 2 . 
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